Walk, Not Wait: Faster Sampling Over Online Social Networks

Authors

  • Azade Nazi
  • Zhuojie Zhou
  • Saravanan Thirumuruganathan
  • Nan Zhang
  • Gautam Das
Abstract

In this paper, we introduce a novel, general-purpose technique for faster sampling of nodes over an online social network. Specifically, unlike traditional random walks, which wait for the sampling distribution to converge to a predetermined target distribution (a waiting process that incurs a high query cost), we develop WALK-ESTIMATE, which starts with a much shorter random walk and then proactively estimates the sampling probability of the node reached, before using acceptance-rejection sampling to adjust the sampling probability to the predetermined target distribution. We present a novel backward random walk technique that provides provably unbiased estimates of the sampling probability, and we demonstrate the superiority of WALK-ESTIMATE over traditional random walks through theoretical analysis and extensive experiments over real-world online social networks.
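The idea in the abstract can be illustrated with a minimal sketch on a toy graph. This is not the paper's implementation: the graph, the function names, the uniform target distribution, and the walk length are all assumptions chosen for illustration. The backward estimator shown is one simple unbiased estimator in the spirit of a backward random walk, and the acceptance step uses exact probabilities (computable only because the toy graph is fully known) rather than estimates.

```python
import random

# Hypothetical toy undirected graph as adjacency lists (not from the paper).
GRAPH = {
    0: [1, 2],
    1: [0, 2, 3],
    2: [0, 1, 3],
    3: [1, 2, 4],
    4: [3],
}


def forward_walk(graph, start, t, rng):
    """Run a simple random walk of exactly t steps and return the end node."""
    node = start
    for _ in range(t):
        node = rng.choice(graph[node])
    return node


def exact_probs(graph, start, t):
    """Exact t-step sampling probabilities, possible only on a known toy graph."""
    probs = {v: 0.0 for v in graph}
    probs[start] = 1.0
    for _ in range(t):
        nxt = {v: 0.0 for v in graph}
        for u, p in probs.items():
            for v in graph[u]:
                nxt[v] += p / len(graph[u])
        probs = nxt
    return probs


def backward_estimate(graph, start, v, t, rng):
    """One unbiased estimate of the probability that a t-step walk from
    `start` ends at `v`, obtained by walking backward from `v`.

    Each backward step picks a uniform neighbor u and reweights by
    deg(current) / deg(u); the estimate is nonzero only if the backward
    walk lands on `start` after t steps.
    """
    weight = 1.0
    node = v
    for _ in range(t):
        u = rng.choice(graph[node])
        weight *= len(graph[node]) / len(graph[u])
        node = u
    return weight if node == start else 0.0


def acceptance_probs(probs):
    """Acceptance probabilities for a uniform target (an assumed choice).

    Assumes every node has positive probability after t steps; accepting
    v with probability min(p) / p(v) makes accepted nodes uniform.
    """
    floor = min(probs.values())
    return {v: floor / p for v, p in probs.items()}


def sample_uniform(graph, start, t, rng):
    """Short walk plus acceptance-rejection, repeated until a node is accepted."""
    accept = acceptance_probs(exact_probs(graph, start, t))
    while True:
        v = forward_walk(graph, start, t, rng)
        if rng.random() < accept[v]:
            return v


if __name__ == "__main__":
    rng = random.Random(0)
    print(exact_probs(GRAPH, 0, 3))
    print([sample_uniform(GRAPH, 0, 3, rng) for _ in range(10)])
```

On this toy graph, a 3-step walk from node 0 ends at nodes 1 and 2 with probability 1/3 each and at nodes 0, 3, and 4 with probability 1/9 each, so the rejection step keeps walks ending at nodes 1 and 2 only one third of the time. In a real setting the exact probabilities are unavailable, which is where an estimator such as `backward_estimate`, averaged over several backward walks, would take their place.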

Similar resources

Leveraging History for Faster Sampling of Online Social Networks

With a vast amount of data available on online social networks, how to enable efficient analytics has been an increasingly important research problem. Many existing studies resort to sampling techniques that draw random nodes from an online social network through its restrictive web/API interface. While almost all of these techniques use the exact same underlying technique of random walk a Mark...

Comparative Study of Sampling Methods for Online Social Networks

The properties of online social networks are of great interest to the general public as well as IT professionals. Often the raw data are not available and the summaries released by the service providers are sketchy. Thus sampling is needed to reveal the hidden properties and structure of the underlying network. This thesis conducts comparative studies on various sampling methods, including Ran...

A centralized privacy-preserving framework for online social networks

There are some critical privacy concerns in current online social networks (OSNs). Users' information is disclosed to entities that were not supposed to access it. Furthermore, the notion of friendship is inadequate in OSNs, since the degree of social relationships between users changes dynamically over time. Additionally, users may define similar privacy settings for their f...

Interpersonal Trust in Online Scientific Social Networks: Causes and Results

Background and Aim: This study investigates the causes of interpersonal trust and the results of trust in online scientific social networks. Methods: This applied research used cluster sampling to collect data. The study population consisted of faculty members at Shiraz University and Persian Gulf University. A sample of 269 people was determined using the Morgan table according to whole pop...

Sampling Online Social Networks by Random Walk with Indirect Jumps

Random walk-based sampling methods are gaining popularity and importance in characterizing large networks. While powerful, they suffer from the slow mixing problem when the graph is loosely connected, which results in poor estimation accuracy. Random walk with jumps (RWwJ) can address the slow mixing problem but it is inapplicable if the graph does not support uniform vertex sampling (UNI). In ...

Journal:
  • PVLDB

Volume 8

Publication date: 2015